Hardening: input validation and bounds tightening across 28 subsystems (round 2)#31175
Conversation
|
Updated 8:59 PM PT - May 21st, 2026
✅ @Jarred-Sumner, your commit c041aa6cb296fdb64b1dce30f854971067433d7d passed in 🧪 To try this PR locally: bunx bun-pr 31175That installs a local version of the PR into your bun-31175 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds repository-wide hardening: decompression limits, installer/linker symlink and alias protections, WASI sandboxing, WebSocket/origin/header guards, Windows batch-arg safety, shell brace fixes, input validation, incremental Valkey scanning/budgeting, YAML merge limits, and several memory/stack-safety fixes. ChangesDecompression Bomb Protection
Installation Path & Alias Validation
WASI Sandbox Path Traversal Prevention
WebSocket & Dev-Server Security
Windows Batch File Argument Safety
Shell Expansion Robustness
Input Validation & Format Hardening
Valkey Protocol Incremental Scanning & Budgeting
YAML Merge-Key Expansion Limits
Memory & Stack Safety
Possibly related PRs
Suggested reviewers
|
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/install/TarballStream.rs (1)
1284-1326: 🧹 Nitpick | 🔵 Trivial | ⚖️ Poor tradeoffConsider adding
NOFOLLOW_ANYflag for consistency with buffered extractor.The buffered extractor in
lib.rs(lines 2007-2015) addsO::NOFOLLOW_ANYwhencreated_symlinksis non-empty to defend against Unicode normalization attacks on APFS/HFS+. The streaming path'sopen_output_filedoesn't have this defense-in-depth.While the lexical
path_traverses_created_symlinkcheck catches most traversals, the comment in lib.rs explains that filesystems with Unicode NFC/NFD normalization (macOS) can alias paths that differ at the byte level. TheNOFOLLOW_ANYflag mitigates this.Consider passing a
has_symlinks: boolparameter to enable the flag when symlinks have been created.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/install/TarballStream.rs` around lines 1284 - 1326, open_output_file currently never sets O::NOFOLLOW_ANY, so the streaming extractor lacks the same APFS/HFS+ NFC/NFD defense as the buffered extractor; add a new parameter (e.g., has_symlinks: bool) to open_output_file and, when true, add O::NOFOLLOW_ANY to the flags before calling bun_sys::openat (and to the flags passed to bun_sys::openat_windows if the platform supports an equivalent); update callers to pass true when created_symlinks (or similar) is non-empty so the streaming path uses NOFOLLOW_ANY consistent with the buffered extractor.src/jsc/bindings/sqlite/JSSQLStatement.cpp (1)
2228-2238:⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy liftFix SQLite ownership claim and address potential
sqlite3handle leak
SQLITE_DESERIALIZE_FREEONCLOSEdoes transfer ownership of the buffer to SQLite; onsqlite3_deserialize()failure SQLite frees the provided buffer for you, so the “don’t free on failure” comment is consistent with SQLite docs (the earlier claim to the contrary is incorrect).- The error paths after
sqlite3_open_v2(":memory:", &db, ...)return without callingsqlite3_close/sqlite3_close_v2, which can leak the opened in-memory database handle whensqlite3_deserialize()returnsSQLITE_BUSYor other non-SQLITE_OKstatuses.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/jsc/bindings/sqlite/JSSQLStatement.cpp` around lines 2228 - 2238, The code incorrectly states ownership behavior of SQLITE_DESERIALIZE_FREEONCLOSE and also can leak the in-memory DB handle when sqlite3_deserialize or related steps fail: update/remove the incorrect "don't free on failure" comment for SQLITE_DESERIALIZE_FREEONCLOSE (SQLite will free the buffer on sqlite3_deserialize failure), and ensure any path that returns after a successful sqlite3_open_v2 (e.g., sqlite3_deserialize returning SQLITE_BUSY or other non-SQLITE_OK) calls sqlite3_close or sqlite3_close_v2 on the opened db before returning; locate and modify the code around sqlite3_open_v2, sqlite3_deserialize, and error-return branches to call sqlite3_close_v2(db) on error and preserve the correct ownership semantics for SQLITE_DESERIALIZE_FREEONCLOSE.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/install/lockfile/Package/Scripts.rs`:
- Around line 312-325: The trusted-dependency check incorrectly passes the alias
twice to lockfile.has_trusted_dependency; change the call in the
add_node_gyp_rebuild_script logic to pass the alias (folder_name) as the alias
parameter and the resolved package name/slice (the real pkg name, e.g.
pkg_name.slice(...) or resolution-derived name) as the package-name parameter so
alias-aware trust matching works; also update any get_list(...) call sites
mentioned to similarly pass the resolved package name alongside the alias so the
trust lookup uses the real package name, not the alias twice.
In `@src/js/internal/debugger.ts`:
- Around line 622-624: The origin comparison currently checks the raw origin
string (variable origin) which can mismatch on equivalent forms; change the
check to use a normalized URL origin (use url.origin if a URL object is already
available, otherwise construct one via new URL(origin).origin) when comparing
against "https://debug.bun.sh" so equivalents like "https://debug.bun.sh:443"
match consistently; update the conditional that uses origin to compare against
url.origin (or the new URL(...).origin) instead.
In `@src/jsc/bindings/webcore/SerializedScriptValue.cpp`:
- Around line 4070-4074: The current check only ensures there are >= 12 bytes
per extra prime but then unconditionally preallocates
Vector<CryptoKeyRSAComponents::PrimeInfo>(primeCount - 2), which lets an
attacker force a large heap allocation; change the allocation to be bounded by
the remaining-wire-bytes: compute a capped count like size_t maxPrimes =
std::min<uint64_t>(primeCount - 2, static_cast<uint64_t>(m_end - m_ptr) / (3 *
sizeof(uint32_t))); then use that capped value when constructing or reserving
the Vector for otherPrimeInfos (or avoid immediate full allocation and push_back
per-prime after validating each length), keeping the per-prime validation loop
unchanged (refer to primeCount, m_ptr, m_end, and
CryptoKeyRSAComponents::PrimeInfo/otherPrimeInfos to locate the code).
In `@src/parsers/yaml.rs`:
- Around line 2322-2326: The doc comment for merge_key_comparisons is
misleading: the counter is a field on Parser and not reset by parse_document(),
so its scope is per Parser/stream rather than per document; update the comment
for merge_key_comparisons (and the other occurrence around 3100-3104) to state
that the comparisons are counted across the entire parse invocation or
per-stream, and mention that parse_document() clears anchors and tag_handles but
does not reset merge_key_comparisons so the limit applies cumulatively for the
Parser instance.
In `@src/runtime/api/bun/h2_frame_parser.rs`:
- Around line 4445-4458: The guard uses get_session_memory_usage() which returns
a MiB-truncated value, so comparing it to max_session_memory (via
self.max_session_memory.get()) lets sessions exceed the intended byte ceiling;
update the check in lookup_inbound_stream (and the other similar guard site) to
compare byte-accurate session usage against the configured ceiling by converting
max_session_memory into bytes or by using a byte-returning variant of
get_session_memory_usage(), and then call send_go_away (using stream_identifier,
ErrorCode::ENHANCE_YOUR_CALM, b"ENHANCE_YOUR_CALM", self.last_stream_id.get(),
true) when the byte-level usage exceeds the limit so the configured cap is
actually enforced.
In `@src/runtime/bake/DevServer.rs`:
- Around line 1458-1469: host_without_port currently strips everything after a
']' or the last ':' without validating that the trailing bytes are a numeric
port, which lets malformed authorities like "localhost:abc" or "[::1]evil" pass;
update host_without_port to validate the port portion: for bracketed hosts
(first byte '[') ensure a matching ']' exists and that any bytes after ']' are
either empty or a single ':' followed only by ASCII digits, and for
non-bracketed hosts, if a ':' exists ensure the suffix after the last ':' is all
ASCII digits; if the port validation fails, return an "invalid" marker (e.g., an
empty slice) so callers like is_allowed_dev_host and is_allowed_dev_origin will
fail closed rather than silently normalizing malformed input.
In `@src/runtime/bake/DevServer/ErrorReportRequest.rs`:
- Around line 585-599: The sanitize_for_terminal function currently treats only
C0 control bytes and DEL as disallowed; update the is_disallowed predicate used
by sanitize_for_terminal to also treat C1 control codes (0x80..=0x9F) as
disallowed so they get replaced with spaces; locate the nested is_disallowed
function in sanitize_for_terminal (and the loop that replaces bytes in the
Arena-allocated copy) and extend its condition to return true for bytes in the
0x80-0x9F range.
In `@src/sql_jsc/postgres/PostgresSQLConnection.rs`:
- Around line 2866-2878: The new check incorrectly treats
AuthenticationState::Sasl(_) as still "in-progress" even after a successful
SASLFinal because SASLFinal only calls s.zero() and does not change the enum;
update the SASLFinal success path (the code handling SASLFinal where s.zero() is
called) to transition the authentication state out of the Sasl variant (e.g.,
set self.authentication_state to None or to a new SaslComplete variant) so that
subsequent AuthenticationOk is accepted; ensure the match at AuthenticationOk
(matching self.authentication_state.get() against AuthenticationState::Sasl(_))
no longer triggers after successful SASLFinal and that sensitive data is still
zeroed via s.zero().
In `@src/which/lib.rs`:
- Around line 148-155: is_batch_file currently checks for a literal
".cmd"/".bat" at the very end of the byte slice, which misses cases like
"foo.cmd " or "foo.cmd." that Windows treats as the same file; update
is_batch_file to trim trailing ASCII spaces (0x20) and periods (0x2E) from the
end of the provided path bytes before performing the extension-length check and
calling strings::eql_case_insensitive_asciii_check_length; implement trimming by
walking a mutable end index down while path[end-1] is b' ' or b'.' (without
allocating), then run the existing length and extension comparisons against the
trimmed end.
---
Outside diff comments:
In `@src/install/TarballStream.rs`:
- Around line 1284-1326: open_output_file currently never sets O::NOFOLLOW_ANY,
so the streaming extractor lacks the same APFS/HFS+ NFC/NFD defense as the
buffered extractor; add a new parameter (e.g., has_symlinks: bool) to
open_output_file and, when true, add O::NOFOLLOW_ANY to the flags before calling
bun_sys::openat (and to the flags passed to bun_sys::openat_windows if the
platform supports an equivalent); update callers to pass true when
created_symlinks (or similar) is non-empty so the streaming path uses
NOFOLLOW_ANY consistent with the buffered extractor.
In `@src/jsc/bindings/sqlite/JSSQLStatement.cpp`:
- Around line 2228-2238: The code incorrectly states ownership behavior of
SQLITE_DESERIALIZE_FREEONCLOSE and also can leak the in-memory DB handle when
sqlite3_deserialize or related steps fail: update/remove the incorrect "don't
free on failure" comment for SQLITE_DESERIALIZE_FREEONCLOSE (SQLite will free
the buffer on sqlite3_deserialize failure), and ensure any path that returns
after a successful sqlite3_open_v2 (e.g., sqlite3_deserialize returning
SQLITE_BUSY or other non-SQLITE_OK) calls sqlite3_close or sqlite3_close_v2 on
the opened db before returning; locate and modify the code around
sqlite3_open_v2, sqlite3_deserialize, and error-return branches to call
sqlite3_close_v2(db) on error and preserve the correct ownership semantics for
SQLITE_DESERIALIZE_FREEONCLOSE.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4d6fcf52-40eb-4201-ad84-aa7bc1e45f7b
📒 Files selected for processing (55)
packages/bun-uws/src/Http3Context.hpackages/bun-uws/src/HttpParser.hsrc/ast/e.rssrc/base64/lib.rssrc/brotli/lib.rssrc/bun_core/util.rssrc/bundler/linker_context/postProcessCSSChunk.rssrc/bundler/linker_context/postProcessJSChunk.rssrc/http/Decompressor.rssrc/http_jsc/websocket_client/WebSocketUpgradeClient.rssrc/ini/lib.rssrc/install/PackageInstaller.rssrc/install/TarballStream.rssrc/install/bin.rssrc/install/isolated_install.rssrc/install/isolated_install/Installer.rssrc/install/lockfile.rssrc/install/lockfile/Package.rssrc/install/lockfile/Package/Scripts.rssrc/js/internal/debugger.tssrc/js/internal/validators.tssrc/js/node/net.tssrc/js/node/tls.tssrc/js/node/wasi.tssrc/jsc/bindings/node/http/NodeHTTPParser.cppsrc/jsc/bindings/sqlite/JSSQLStatement.cppsrc/jsc/bindings/webcore/SerializedScriptValue.cppsrc/libarchive/lib.rssrc/parsers/yaml.rssrc/runtime/api/bun/h2_frame_parser.rssrc/runtime/api/bun/js_bun_spawn_bindings.rssrc/runtime/bake/DevServer.rssrc/runtime/bake/DevServer/ErrorReportRequest.rssrc/runtime/bake/mod.rssrc/runtime/cli/bunx_command.rssrc/runtime/cli/create_command.rssrc/runtime/cli/pm_trusted_command.rssrc/runtime/crypto/pwhash.rssrc/runtime/server/server_body.rssrc/runtime/shell/states/Cmd.rssrc/runtime/shell/states/Expansion.rssrc/runtime/valkey_jsc/js_valkey.rssrc/runtime/valkey_jsc/valkey.rssrc/runtime/webcore/Blob.rssrc/semver/SemverQuery.rssrc/shell_parser/parse.rssrc/sql/mysql/protocol/StackReader.rssrc/sql_jsc/mysql/MySQLConnection.rssrc/sql_jsc/postgres/PostgresSQLConnection.rssrc/sys/lib.rssrc/valkey/valkey_protocol.rssrc/which/lib.rssrc/zlib/lib.rssrc/zstd/lib.rstest/js/node/http/early-hints-crlf-injection.test.ts
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/runtime/bake/DevServer.rs (1)
1461-1475:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject raw IPv6 authorities and empty ports here.
This still normalizes some malformed authorities into allowed hosts:
localhost:/[::1]:pass becauseall()is true on an empty slice, and unbracketed IPv6-like inputs such as2001:db8::1:80are treated ashost + portinstead of being rejected. That weakens the fail-closed contract for bothis_allowed_dev_host()andis_allowed_dev_origin().Suggested fix
fn host_without_port(host: &[u8]) -> &[u8] { let (host, rest) = if host.first() == Some(&b'[') { match strings::index_of_scalar(host, b']') { Some(end) => (&host[..=end], &host[end + 1..]), None => return b"", } } else { - match strings::last_index_of_char(host, b':') { - Some(colon) => (&host[..colon], &host[colon..]), + let first_colon = strings::index_of_scalar(host, b':'); + match (first_colon, strings::last_index_of_char(host, b':')) { + (Some(first), Some(last)) if first != last => return b"", + (_, Some(colon)) => (&host[..colon], &host[colon..]), None => (host, &host[host.len()..]), } }; match rest { [] => host, - [b':', port @ ..] if port.iter().all(u8::is_ascii_digit) => host, + [b':', port @ ..] if !port.is_empty() && port.iter().all(u8::is_ascii_digit) => host, _ => b"", } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/bake/DevServer.rs` around lines 1461 - 1475, The current parsing accepts empty ports and treats unbracketed IPv6-like authorities as host:port; update the logic around (host, rest) and the final match so empty ports are rejected and unbracketed inputs containing multiple ':' are not treated as host+port. Concretely: in the final match change the port arm to require non-empty port (e.g. [b':', port @ ..] if !port.is_empty() && port.iter().all(u8::is_ascii_digit) => host), and in the non-bracketed branch (the else that calls strings::last_index_of_char) detect multiple ':' in host (count ':' on the original host slice) and return b"" if more than one colon (reject raw IPv6 authorities); keep references to host, rest, strings::index_of_scalar, strings::last_index_of_char, and ensure is_allowed_dev_host()/is_allowed_dev_origin() will thus see rejected inputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/runtime/bake/DevServer.rs`:
- Around line 1461-1475: The current parsing accepts empty ports and treats
unbracketed IPv6-like authorities as host:port; update the logic around (host,
rest) and the final match so empty ports are rejected and unbracketed inputs
containing multiple ':' are not treated as host+port. Concretely: in the final
match change the port arm to require non-empty port (e.g. [b':', port @ ..] if
!port.is_empty() && port.iter().all(u8::is_ascii_digit) => host), and in the
non-bracketed branch (the else that calls strings::last_index_of_char) detect
multiple ':' in host (count ':' on the original host slice) and return b"" if
more than one colon (reject raw IPv6 authorities); keep references to host,
rest, strings::index_of_scalar, strings::last_index_of_char, and ensure
is_allowed_dev_host()/is_allowed_dev_origin() will thus see rejected inputs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: cbc2bd10-cba4-434f-b981-8216c6008a73
📒 Files selected for processing (6)
src/js/internal/debugger.tssrc/jsc/bindings/webcore/SerializedScriptValue.cppsrc/parsers/yaml.rssrc/runtime/bake/DevServer.rssrc/runtime/bake/DevServer/ErrorReportRequest.rssrc/which/lib.rs
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/install/bin.rs (1)
1605-1607:⚠️ Potential issue | 🟠 Major | ⚡ Quick winApply the same escape guard in
unlink()for directory bins.This only hardens
link().Linker::unlink()still resolvesTag::Dirtargets without validation, so a package installed before this change can still walk outside its package dir during removal and prune unrelated.binentries.Suggested fix
Tag::Dir => { let dir = self.bin.value.dir; let target = dir.slice(self.string_buf); - if target.is_empty() { + if target.is_empty() || bin_target_escapes_package_dir(target) { return; } let abs_target_dir = resolve_path::join_abs_string_z::<PlatformAuto>(package_dir, &[target]);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/install/bin.rs` around lines 1605 - 1607, Linker::unlink() currently resolves Tag::Dir targets without validating the directory target; add the same escape guard used in link() by computing the target string (using dir.slice(self.string_buf) or the same helper path) and checking if target.is_empty() || bin_target_escapes_package_dir(target) before proceeding to unlink, and return early on failure. Ensure the check mirrors the one in link() so directory bins that escape the package dir are not removed and existing .bin entries outside the package tree are preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/install/bin.rs`:
- Around line 769-775: The check that rejects any target containing b':' is too
broad; restrict it to Windows-specific drive-relative or ADS forms. Replace the
unconditional if target.contains(&b':') { ... } with a condition that only
triggers for Windows-style patterns (e.g., a leading drive-letter + colon like
/^[A-Za-z]:/ or a colon followed by something that is not a path separator
indicating an alternate-data-stream) and/or when cfg!(windows); update the
branch that returns true to use this narrowed test while leaving other logic
(the surrounding depth/walk checks on target) unchanged so Unix filenames with
':' are accepted.
In `@src/runtime/api/bun/h2_frame_parser.rs`:
- Around line 5535-5550: The current memory accounting filters out
StreamState::CLOSED entries so sequentially-closed streams on a long-lived
connection are not charged; either stop filtering here and count all entries
(replace the live_streams computation with self.streams.get().iter().count()) so
maxSessionMemory includes closed-but-resident Stream allocations, or
alternatively implement pruning of closed streams when they transition to CLOSED
(remove their entry from self.streams at the close/teardown path—e.g., where
lookup_inbound_stream()/stream close logic runs) so that keeping the filter
remains correct.
---
Outside diff comments:
In `@src/install/bin.rs`:
- Around line 1605-1607: Linker::unlink() currently resolves Tag::Dir targets
without validating the directory target; add the same escape guard used in
link() by computing the target string (using dir.slice(self.string_buf) or the
same helper path) and checking if target.is_empty() ||
bin_target_escapes_package_dir(target) before proceeding to unlink, and return
early on failure. Ensure the check mirrors the one in link() so directory bins
that escape the package dir are not removed and existing .bin entries outside
the package tree are preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 73827d4a-caa7-43c9-b622-5bb133a36ecb
📒 Files selected for processing (5)
src/install/bin.rssrc/runtime/api/bun/h2_frame_parser.rssrc/runtime/cli/bunx_command.rssrc/runtime/cli/create_command.rssrc/shell_parser/parse.rs
25eca4a to
76931c0
Compare
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/runtime/shell/states/Expansion.rs (1)
645-656: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueConsider clearing
brace_meta_offsetsindeinitfor consistency.Other working buffers (
out.buf,out.bounds,current_out) are explicitly cleared. Clearingbrace_meta_offsetswould maintain the same pattern and avoid holding onto its allocation if node pooling is ever introduced.Suggested fix
me.out.buf.clear(); me.out.bounds.clear(); me.current_out.clear(); + me.brace_meta_offsets.clear(); me.base.end_scope();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/shell/states/Expansion.rs` around lines 645 - 656, The deinit function for Expansion currently clears out.buf, out.bounds, and current_out but doesn't clear brace_meta_offsets; update the deinit implementation (inside the pub fn deinit(interp: &Interpreter, this: NodeId) function) to also clear the Expansion's brace_meta_offsets (use the local mutable reference `me` from interp.as_expansion_mut(this) and call clear() on its brace_meta_offsets) so it releases any held allocation consistently with the other buffers.src/jsc/bindings/webcore/SerializedScriptValue.cpp (1)
4648-4667:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject PEM payloads larger than OpenSSL’s
BIO_new_mem_bufcan represent.
BIO_new_mem_buftakesint len(BIO_new_mem_buf(const void *buf, int len)). The code readspemSizeasuint64_tand passes it directly toBIO_new_mem_buf(m_ptr, length); values aboveINT_MAXwill truncate on the int conversion before parsing, even though the buffer bounds check (m_end - m_ptr < length) passes. Negative lengths aren’t relevant here since the size is unsigned.🛡️ Proposed fix
case CryptoKeyType::Public: case CryptoKeyType::Private: { uint64_t pemSize = 0; if (!read(pemSize)) { fail(); return JSValue(); } + if (pemSize > static_cast<uint64_t>(std::numeric_limits<int>::max())) { + fail(); + return JSValue(); + } BIO* rawBio = nullptr; if (!read(&rawBio, pemSize)) { fail(); return JSValue();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/jsc/bindings/webcore/SerializedScriptValue.cpp` around lines 4648 - 4667, The code reads pemSize as a uint64_t and later passes it to BIO_new_mem_buf via read(&rawBio, pemSize), which can overflow because BIO_new_mem_buf takes an int; add an explicit check that pemSize <= std::numeric_limits<int>::max() (and > 0 if desired) before calling read/BIO creation and reject (call fail() and return JSValue()) when the size is too large; update the logic around the read(&rawBio, pemSize) call in SerializedScriptValue.cpp so oversized PEM payloads are rejected prior to creating ncrypto::BIOPointer and before calling PEM_read_bio_PUBKEY / PEM_read_bio_PrivateKey (references: pemSize, read, ncrypto::BIOPointer, PEM_read_bio_PUBKEY, PEM_read_bio_PrivateKey, KeyObject, JSPublicKeyObject::create).src/parsers/yaml.rs (1)
3175-3196:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not silently drop
<<when a merge array contains non-object entries.The scalar/object fallback preserves the original property, but the array branch just skips non-object items and still returns
Ok(()). For inputs like<<: [1]or<<: [{a: 1}, 2], that loses user data instead of preserving the literal<<property or rejecting it.Suggested fix
match &value.data { ast::ExprData::EObject(value_obj) => { self.merge(value_obj.properties.slice(), comparisons) } - ast::ExprData::EArray(value_arr) => { + ast::ExprData::EArray(value_arr) + if value_arr + .items + .slice() + .iter() + .all(|item| matches!(&item.data, ast::ExprData::EObject(_))) => + { for item in value_arr.items.slice() { - let item_obj = match &item.data { - ast::ExprData::EObject(obj) => obj, - _ => continue, - }; + let ast::ExprData::EObject(item_obj) = &item.data else { + unreachable!(); + }; self.merge(item_obj.properties.slice(), comparisons)?; } Ok(()) } _ => {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/parsers/yaml.rs` around lines 3175 - 3196, The EArray branch currently skips non-object items and loses the original `<<` property; change it to detect any non-object entries and fall back to preserving the original `<<` property (same behavior as the scalar/object fallback) instead of silently dropping data. Concretely, in the match arm handling ast::ExprData::EArray(value_arr) iterate items and if you encounter a non-object item push the original property into self.list (using G::Property { key: Some(key), value: Some(value), ..Default::default() }) and return Ok(()); only call self.merge(...) for arrays where every item is an object (still returning Ok(()) after merging), and keep using the same comparisons argument.src/js/node/wasi.ts (1)
1591-1598:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
path_openreturns success after logging non-WASI errors.When a non-
WASIErrorexception is caught, the code logs it but then falls through to returnWASI_ESUCCESS(line 1597). This is pre-existing behavior, but the new containment checks make it more likely that legitimate errors (e.g., fromfs.openSync) get swallowed.Consider returning
WASI_EIOorWASI_EINVALafter logging:Suggested fix
} catch (e) { if (e instanceof types_1.WASIError) { return e.errno; } console.error(e); + return constants_1.WASI_EIO; } - return constants_1.WASI_ESUCCESS; },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/js/node/wasi.ts` around lines 1591 - 1598, In path_open's catch block (the one checking instanceof types_1.WASIError), stop falling through to constants_1.WASI_ESUCCESS for non-WASI exceptions: after console.error(e) return an appropriate errno such as constants_1.WASI_EIO (or constants_1.WASI_EINVAL if the error indicates a bad argument) so unexpected JS/fs errors aren’t treated as success; keep the existing branch that returns e.errno for types_1.WASIError unchanged.
♻️ Duplicate comments (1)
src/install/bin.rs (1)
776-780:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDon't reject Unix paths just because the first segment contains
:.This still skips valid in-package bin targets on Unix, e.g.
bin/foo:cli. The rejection should stay limited to Windows drive-prefixed / ADS forms instead of treating any first-component colon as an escape.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/install/bin.rs` around lines 776 - 780, The current check on target (using target.split(...).next().is_some_and(|first| first.contains(&b':'))) incorrectly rejects Unix package bins like "bin/foo:cli"; change the predicate to detect only Windows drive-letter prefixes (e.g., a two-byte segment where first is ASCII letter and second is b':'). In other words, keep using target.split(|&b| b == b'/' || b == b'\\').next() but replace the closure to return true only when first.len() == 2 && first[1] == b':' && first[0].is_ascii_alphabetic(), so only drive-prefixed paths are rejected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/brotli/lib.rs`:
- Around line 244-247: The check against max_output_size is only in the
needs_more_output branch, allowing a successful decode path to append data and
exceed max_output_size; update the decode loop (the function that manipulates
self.list_ptr and transitions ReaderState) so that after any append/push to
self.list_ptr (including in the success path) you immediately check if
self.list_ptr.len() > self.max_output_size and, if so, set self.state =
ReaderState::Error and return the BrotliDecompressionError; ensure this check
sits right after each place that grows list_ptr (not just in needs_more_output)
so no branch can bypass the cap.
In `@src/bun_core/util.rs`:
- Around line 5789-5815: get_boundary incorrectly splits on every ';' (using
index_of_char) even when that semicolon is inside a quoted-string, so boundary=
can be picked from another parameter's quoted value; change the loop in
get_boundary to scan rest byte-by-byte and find the next parameter separator
only when not inside a quoted-string: maintain an in_quote boolean and an escape
handling (treat backslash as escaping the next byte inside quotes) while
iterating to find the next unquoted ';' index instead of calling
crate::strings_impl::index_of_char(rest, b';'); then use that index to advance
rest, keep using crate::strings_impl::trim_left and strip_prefix(b"boundary=")
and preserve the existing quoted boundary handling (strip surrounding quotes)
and empty-boundary check.
In `@src/parsers/yaml.rs`:
- Around line 3126-3131: The code currently charges comparisons by adding
self.list.len() up front before scanning for duplicate merge keys, which
overcounts; instead, stop precharging and increment the comparisons counter each
time you actually perform a key equality check inside the duplicate-scan loop.
Concretely, in the 'next_merge_prop' loop where you get
merge_prop.key.as_ref().unwrap(), remove the '*comparisons =
comparisons.saturating_add(self.list.len() as u64)' precharge and instead
increment *comparisons by 1 immediately before or when comparing merge_key to
the items in self.list (checking against Self::MAX_MERGE_KEY_COMPARISONS and
returning ParseError::MergeKeyLimitExceeded if exceeded), so the counter
reflects actual equality checks rather than the worst-case length.
In `@src/runtime/api/bun/js_bun_spawn_bindings.rs`:
- Around line 260-265: The code validates args[1..] for cmd.exe metacharacters
but misses argv0 (options.argv0), allowing metachar injection for batch targets;
update the logic around argv0_result/argv so that when is_batch_file (computed
via bun_which::is_batch_file(argv0_result.argv0.as_bytes())), you validate
argv0_result.argv0 (or options.argv0) for shell metacharacters before pushing it
into argv, or postpone pushing argv0 into argv until after the batch-file
metacharacter checks succeed; ensure the same validation used for args[1..] is
applied to argv0_result.argv0 and reference argv0_result, argv, options.argv0,
and is_batch_file in your change.
In `@src/runtime/bake/DevServer.rs`:
- Around line 1475-1478: The pattern matching on `rest` in DevServer.rs accepts
an empty port slice (e.g., `localhost:`) because `port.iter().all(...)` is true
for empty slices; update the match arm that handles `[b':', port @ ..]` to
reject empty ports by requiring `!port.is_empty()` in the guard (so the guard
becomes something like `!port.is_empty() &&
port.iter().all(u8::is_ascii_digit)`), ensuring malformed authorities do not
normalize to allowed hosts used by `is_allowed_dev_host()` /
`is_allowed_dev_origin()`.
In `@src/valkey/valkey_protocol.rs`:
- Around line 702-727: ReplyScanner::scan_one currently masks negative aggregate
lengths by doing u64::try_from(len).unwrap_or(0), which makes inputs like
Attribute "-1" appear as a single pending child and causes scan() to return
NeedMoreData indefinitely; update scan_one to mirror read_value_with_depth's
validation: after calling reader.read_integer() check if len < 0 and return the
appropriate error (e.g., RedisError::InvalidAttribute or equivalent for
Map/Array) instead of converting to 0, keep the existing nesting-depth checks
(ValkeyReader::MAX_NESTING_DEPTH) and otherwise convert positive lengths to u64
and apply the same saturating_mul/saturating_add logic used now.
- Around line 407-408: The code uses size_of::<RESPValue>() in calls like
Vec::with_capacity(self.take_prealloc_budget(len, size_of::<RESPValue>())) (seen
near usages in take_prealloc_budget and RESPValue allocation sites) but size_of
is not imported; fix by importing or qualifying the symbol—add a use
std::mem::size_of; at the top of src/valkey/valkey_protocol.rs (or replace calls
with std::mem::size_of::<RESPValue>()) so the Vec::with_capacity(...)
expressions compile.
In `@src/zlib/lib.rs`:
- Around line 517-520: The current check uses self.zlib.total_out >=
self.max_output_size but inflate can still write past the remaining budget in a
single call; update the inflate write path to compute remaining =
self.max_output_size.saturating_sub(self.zlib.total_out as usize) and ensure you
never call inflate with an output buffer larger than remaining — if remaining ==
0 immediately set self.state = ZlibReaderArrayListState::Error and return
Err(ZlibError::ZlibError); otherwise limit the inflater's output slice/length to
remaining and handle a write that would exceed remaining by marking Error and
returning ZlibError::ZlibError so the cap is strictly enforced.
In `@src/zstd/lib.rs`:
- Around line 387-390: The current guard using self.list_ptr.len() >=
self.max_output_size is checked before each loop iteration but a single
ZSTD_decompressStream call can still write past the cap; modify the loop around
ZSTD_decompressStream so you compute remaining =
self.max_output_size.saturating_sub(self.list_ptr.len()) and restrict the
decompression output for this call to at most remaining (e.g. by setting the
output buffer size/available bytes passed into ZSTD_decompressStream or by
truncating/limiting the chunk passed to it); if remaining == 0, set self.state =
State::Error and return Err(ZstdError::ZstdDecompressionError) immediately.
Ensure references to self.list_ptr.len(), self.max_output_size, State::Error and
the call to ZSTD_decompressStream are updated accordingly so no single call can
exceed the cap.
---
Outside diff comments:
In `@src/js/node/wasi.ts`:
- Around line 1591-1598: In path_open's catch block (the one checking instanceof
types_1.WASIError), stop falling through to constants_1.WASI_ESUCCESS for
non-WASI exceptions: after console.error(e) return an appropriate errno such as
constants_1.WASI_EIO (or constants_1.WASI_EINVAL if the error indicates a bad
argument) so unexpected JS/fs errors aren’t treated as success; keep the
existing branch that returns e.errno for types_1.WASIError unchanged.
In `@src/jsc/bindings/webcore/SerializedScriptValue.cpp`:
- Around line 4648-4667: The code reads pemSize as a uint64_t and later passes
it to BIO_new_mem_buf via read(&rawBio, pemSize), which can overflow because
BIO_new_mem_buf takes an int; add an explicit check that pemSize <=
std::numeric_limits<int>::max() (and > 0 if desired) before calling read/BIO
creation and reject (call fail() and return JSValue()) when the size is too
large; update the logic around the read(&rawBio, pemSize) call in
SerializedScriptValue.cpp so oversized PEM payloads are rejected prior to
creating ncrypto::BIOPointer and before calling PEM_read_bio_PUBKEY /
PEM_read_bio_PrivateKey (references: pemSize, read, ncrypto::BIOPointer,
PEM_read_bio_PUBKEY, PEM_read_bio_PrivateKey, KeyObject,
JSPublicKeyObject::create).
In `@src/parsers/yaml.rs`:
- Around line 3175-3196: The EArray branch currently skips non-object items and
loses the original `<<` property; change it to detect any non-object entries and
fall back to preserving the original `<<` property (same behavior as the
scalar/object fallback) instead of silently dropping data. Concretely, in the
match arm handling ast::ExprData::EArray(value_arr) iterate items and if you
encounter a non-object item push the original property into self.list (using
G::Property { key: Some(key), value: Some(value), ..Default::default() }) and
return Ok(()); only call self.merge(...) for arrays where every item is an
object (still returning Ok(()) after merging), and keep using the same
comparisons argument.
In `@src/runtime/shell/states/Expansion.rs`:
- Around line 645-656: The deinit function for Expansion currently clears
out.buf, out.bounds, and current_out but doesn't clear brace_meta_offsets;
update the deinit implementation (inside the pub fn deinit(interp: &Interpreter,
this: NodeId) function) to also clear the Expansion's brace_meta_offsets (use
the local mutable reference `me` from interp.as_expansion_mut(this) and call
clear() on its brace_meta_offsets) so it releases any held allocation
consistently with the other buffers.
---
Duplicate comments:
In `@src/install/bin.rs`:
- Around line 776-780: The current check on target (using
target.split(...).next().is_some_and(|first| first.contains(&b':'))) incorrectly
rejects Unix package bins like "bin/foo:cli"; change the predicate to detect
only Windows drive-letter prefixes (e.g., a two-byte segment where first is
ASCII letter and second is b':'). In other words, keep using target.split(|&b| b
== b'/' || b == b'\\').next() but replace the closure to return true only when
first.len() == 2 && first[1] == b':' && first[0].is_ascii_alphabetic(), so only
drive-prefixed paths are rejected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8636b4bc-a7ef-4575-8d86-14174b910b5d
📒 Files selected for processing (55)
packages/bun-uws/src/Http3Context.hpackages/bun-uws/src/HttpParser.hsrc/ast/e.rssrc/base64/lib.rssrc/brotli/lib.rssrc/bun_core/util.rssrc/bundler/linker_context/postProcessCSSChunk.rssrc/bundler/linker_context/postProcessJSChunk.rssrc/http/Decompressor.rssrc/http_jsc/websocket_client/WebSocketUpgradeClient.rssrc/ini/lib.rssrc/install/PackageInstaller.rssrc/install/TarballStream.rssrc/install/bin.rssrc/install/isolated_install.rssrc/install/isolated_install/Installer.rssrc/install/lockfile.rssrc/install/lockfile/Package.rssrc/install/lockfile/Package/Scripts.rssrc/js/internal/debugger.tssrc/js/internal/validators.tssrc/js/node/net.tssrc/js/node/tls.tssrc/js/node/wasi.tssrc/jsc/bindings/node/http/NodeHTTPParser.cppsrc/jsc/bindings/sqlite/JSSQLStatement.cppsrc/jsc/bindings/webcore/SerializedScriptValue.cppsrc/libarchive/lib.rssrc/parsers/yaml.rssrc/runtime/api/bun/h2_frame_parser.rssrc/runtime/api/bun/js_bun_spawn_bindings.rssrc/runtime/bake/DevServer.rssrc/runtime/bake/DevServer/ErrorReportRequest.rssrc/runtime/bake/mod.rssrc/runtime/cli/bunx_command.rssrc/runtime/cli/create_command.rssrc/runtime/cli/pm_trusted_command.rssrc/runtime/crypto/pwhash.rssrc/runtime/server/server_body.rssrc/runtime/shell/states/Cmd.rssrc/runtime/shell/states/Expansion.rssrc/runtime/valkey_jsc/js_valkey.rssrc/runtime/valkey_jsc/valkey.rssrc/runtime/webcore/Blob.rssrc/semver/SemverQuery.rssrc/shell_parser/parse.rssrc/sql/mysql/protocol/StackReader.rssrc/sql_jsc/mysql/MySQLConnection.rssrc/sql_jsc/postgres/PostgresSQLConnection.rssrc/sys/lib.rssrc/valkey/valkey_protocol.rssrc/which/lib.rssrc/zlib/lib.rssrc/zstd/lib.rstest/js/node/http/early-hints-crlf-injection.test.ts
50fe071 to
c041aa6
Compare
Follow-up to #31129. 46 small, independent commits tightening input validation, bounds checks, size limits, and resource lifetimes across:
Each change is the minimal validation or bound needed at that site; no behavioral changes for well-formed inputs.
cargo checkclean on all 10 cross-platform targets